Patient satisfaction and value-based purchasing in hospitals, Odisha, India

Abstract Objective To examine how a general inpatient satisfaction survey functions as a hospital performance measure. Methods We conducted a mixed-methods pilot study of the Hospital Consumer Assessment of Health Providers and Systems survey in Odisha, India. We divided the study into three steps: cognitive testing of the survey, item testing with exploratory factor analysis and content validity indexing. Cognitive testing involved 50 participants discussing their interpretation of survey items. The survey was then administered to 507 inpatients across five public hospitals in Odisha, followed by exploratory factor analysis. Finally, we interviewed 15 individuals to evaluate the content validity of the survey items. Findings Cognitive testing revealed that six out of 18 survey questions were not consistently understood within the Odisha inpatient setting, highlighting issues around responsibilities for care. Exploratory factor analysis identified a six-factor structure explaining 66.7% of the variance. Regression models showed that interpersonal care from doctors and nurses had the strongest association with overall satisfaction. An assessment of differential item functioning revealed that patients with a socially marginalized caste reported higher disrespectful care, though this did not translate into differences in reported satisfaction. Content validity indexing suggested that discordance between experiences of disrespectful care and satisfaction ratings might be due to low patient expectations. Conclusion Using satisfaction ratings without nuanced approaches in value-based purchasing programmes may mask poor-quality interpersonal services, particularly for historically marginalized patients. Surveys should be designed to accurately capture true levels of dissatisfaction, ensuring that patient concerns are not hidden.


Introduction
The government has focused on the quality of care covered through the scheme, including patient satisfaction as a key quality metric in several accountability programmes. 5,6 A proposed nationwide programme would formally tie hospital performance to payment, with up to 15% of reimbursement depending on the quality of services delivered. 7 Satisfaction is the programme's primary proposed measure of patient-centred care, similar to many value-based purchasing programmes in high-income countries that incentivize high-quality care by linking hospital payments to performance. 8 Hence, poor performance on patient satisfaction measures may represent a substantial financial risk for hospitals.
The Ministry of Health and Family Welfare of India has long prioritized measuring patients' satisfaction with secondary and tertiary care. For example, Mera Aspataal (My Hospital) is a health ministry digital platform used to capture patient feedback on services received from both public and private health facilities. 9 To develop this platform, the health ministry used a review of validated patient surveys. 6 Mera Aspataal data have informed three policy efforts: a public reporting programme, the national hospital accreditation programme, and a results-based incentives effort focused on hospital cleanliness and physical infrastructure. 6 Alternate sources of information, such as insurance claims data, on the quality of health services delivered in inpatient settings across India are scarce. 10,11 However, the use of patient satisfaction measures within payment programmes has been controversial 8 and there are debates on how best to interpret and value satisfaction ratings. 12,13 Implicit in any survey-based measure is the assumption that tools are consistently understood by the patient and that variation represents the underlying construct being assessed, as opposed to differences in how people understand or interpret a concept or tool. 14 Critics argue that due to information asymmetry, some patients may rate the superficial aspects of the visit (for example, an imposing lobby) rather than the technical or interpersonal quality of care provided by health workers. 15 This issue may be particularly relevant as low- and middle-income countries improve access to hospital-based care, and newly insured patients may use secondary and tertiary services for the first time. 2,16 While the health ministry already prioritizes patient satisfaction, we lack an in-depth understanding of how patients understand and value aspects of the care interaction, and how those understandings inform satisfaction reporting in the context of a value-based purchasing programme.
7 To better understand how satisfaction ratings function within an Indian inpatient setting, we conducted a pilot study using a comprehensive survey tool that assesses both patients' experiences with a given clinical interaction and their overall satisfaction rating. Considering the proposed value-based purchasing programme, we posed the following research questions: what aspects of patient experience do patients value when rating their satisfaction with care? Does the tool function

Methods
We conducted a mixed-methods assessment of a comprehensive patient experience survey tool, focusing on how patients report overall satisfaction with general inpatient care. 7 We employed methods similar to those used in the development of the tool (Table 1). 17 We divided the study into three steps: cognitive testing of the survey; item testing and exploratory factor analysis; and content validity indexing. We built on prior work on patient satisfaction in Indian clinical settings. 5,18 We used the Hospital Consumer Assessment of Health Providers and Systems survey, due to its use in the nationwide value-based purchasing programme in the United States of America 19 and its relevance to India's proposed programme. 7 In India, the tool and its derivatives have been used to assess hospital quality and inform digital health platforms. 6 The survey includes questions assessing aspects of the patients' experience across six domains: interpersonal care from nurses; interpersonal care from doctors; the hospital environment; general experience; after-discharge care; and understanding of care. 25 These patient experience questions employ a four-point Likert scale, and additional questions collect demographic information, such as age and gender.

Step 1
In this assessment, respondents discussed what each survey item meant to them with the goal of exploring the processes by which respondents answer survey questions. We followed the protocol developed for the Hospital Consumer Assessment of Health Providers and Systems survey. 17 Participants included 50 convenience-sampled Odia-speaking individuals, 27 women and 23 men (gender was self-reported). We conducted the cognitive testing in Bhubaneswar, India, with all assessments in Odia, and clarifying discussions in Odia, Hindi and English. During a day-long session, participants reviewed each survey question in full, working in focus groups of 7 to 12 individuals to discuss their understanding of each question. We reimbursed the individuals for their participation. We used scripted probes to elicit additional insights into cognitive processes and conceptual equivalence in processing survey items. 30 We used deductive qualitative analysis to categorize identified issue types.

Step 2
We administered the Odia-translated Hospital Consumer Assessment of Health Providers and Systems survey to patients at the time of discharge who had been hospitalized for at least 24 hours. We sampled five public hospitals across Odisha from purposively selected districts. Districts were first grouped according to administrative units, then selected to represent the diversity of the state in terms of tribal population, urbanization, coastal and mining areas, which are believed to influence health, health-care utilization and health-related expenditure. For each hospital, we surveyed approximately 100 patients (20 female obstetrics inpatients, plus 40 female and 40 male general inpatients) with an average survey duration of 35 minutes. When the number of patients being discharged exceeded the number of patients the enumerators were able to survey, we used a stratified random sampling strategy with a list frame approach to reduce bias. We set the target sample to 500 respondents, which exceeds recommendations for quantitative validation involving patients (250-350 patients) 31 and meets the threshold of "very good" for factor analysis. 32

With the resulting survey data, we conducted an exploratory factor analysis using principal-component factors (assuming no unique factors), and calculated the average of all correlations between each item and the total score (Cronbach's α). Additionally, we ran three models examining the relationship between individual survey items and overall patient satisfaction. Model I is an unadjusted bivariate ordinary least squares regression where overall satisfaction is the dependent variable, and each patient experience survey item is treated as a separate independent variable. Model II adds the patient's age and gender, as well as variables relevant to clinical complexity: whether the patient was admitted through the emergency department; the patient's self-reported rating of health; length of stay; and facility type. Model III adds variables relevant to the interview: interviewer ID and an enumerator rating of interview privacy.

Finally, we assessed differential item functioning by disaggregating results by caste, assessing differences in means with a two-sample t-test, and producing a Spearman's rank correlation coefficient for each subgroup to assess the strength of the relationship between exposure to disrespectful care and odds of reporting dissatisfaction. Dissatisfaction is shown as an unweighted proportion, with the four most negative response options (of 10) combined to generate one negative rating.

Table 2 (excerpt)
- When I left the hospital, I had a good understanding of the things I was responsible for in managing my health.
- Understand purpose of medications: When I left the hospital, I clearly understood the purpose for taking each of my medications? No issues raised (NA).

NA: not applicable.
a Construct issues were raised when the item was understood differently than its intended construct. Information issues were raised when there was unclear or inadequate information for a patient to answer the question reliably. Relevance issues were when there was something about the question that raised concern, e.g.
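Model I from Step 2 can be illustrated with a minimal sketch, assuming one experience item and the overall rating are held as plain numeric lists. The function name `bivariate_ols`, the 1 to 10 rating range and the toy data are all hypothetical and are not taken from the study; Models II and III add covariates and would need a multivariable solver, which is not shown here.

```python
from statistics import mean


def bivariate_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: one experience item (four-point scale) and an
# overall satisfaction rating assumed to run from 1 to 10.
item_scores = [1, 2, 2, 3, 3, 4, 4, 4]
overall = [3, 5, 4, 6, 7, 8, 9, 8]
intercept, slope = bivariate_ols(item_scores, overall)
```

In this framing, the slope plays the role of the unadjusted Model I coefficient for a single item; repeating the call once per item would give one coefficient per row of a results table.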

Step 3
To assess the degree to which questionnaire items constitute an adequate operational definition of our construct of interest, 33 that is, patients' overall satisfaction, we used item-level content validity indexing. 21 We interviewed 15 individuals, purposively sampled across three categories: patients, health workers and experts. Patients were people familiar with public hospital care in Odisha and included hospital patients on the day of discharge; health workers were currently providing clinical care in Odisha; and experts were researchers experienced in collecting patient data from inpatient settings in Odisha. Each interview was in person and lasted approximately one hour. The interviews involved providing verbal instructions on how to use the Likert scale (1: not relevant; 2: somewhat relevant; 3: relevant; and 4: highly relevant) to evaluate the relevance of survey items, followed by questions to explain why they did, or did not, think the item was relevant. Two separate scores were captured: (i) the item's relevance to patient satisfaction; and (ii) the item's relevance given the clinical setting. By allowing interviewees to provide two distinct scores, we were able to address concerns regarding care expectations identified during cognitive testing. This approach helped us better distinguish whether low ratings were due to concerns with the item's relevance to patient satisfaction, or other factors, such as feasibility and structural constraints in the study setting.
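A common convention for an item-level content validity index is the share of raters who score an item 3 or 4 on the four-point relevance scale. The sketch below assumes that convention; the text does not state the exact cut-off or aggregation used in this study, and the function name `item_cvi` and the ratings are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """Share of raters scoring the item at or above relevant_from (scale 1-4)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be on the 1-4 relevance scale")
    return sum(r >= relevant_from for r in ratings) / len(ratings)


# Hypothetical ratings from 15 interviewees for one item, scored twice:
# (i) relevance to patient satisfaction, (ii) relevance given the clinical setting.
to_satisfaction = [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 3, 4]
given_setting = [2, 3, 1, 2, 3, 2, 2, 4, 1, 2, 3, 2, 2, 1, 3]

cvi_satisfaction = item_cvi(to_satisfaction)
cvi_setting = item_cvi(given_setting)
gap = cvi_satisfaction - cvi_setting  # large gap: valued, but not expected in this setting
```

Keeping the two scores separate is what allows a gap to be read as an expectations problem rather than as low importance to the patient.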

Disaggregating expectations
Finally, to outline policy-relevant implications of this work, we used Thompson and Sunol's framework to organize sources of variation into four categories: ideal expectations, predicted expectations, normative expectations and patient expression. 34

Results
Participants in the cognitive testing surfaced several fundamental concerns. They flagged six out of 18 questions as having relevance issues to the Odisha inpatient setting. These issues centred around responsibility for care. For example, families, not health workers, may be responsible for cleanliness. Furthermore, participants thought that doctors were responsible for communicating clinical information, but did not think they were responsible for explaining the information. These concerns informed conversations about which tasks were the responsibilities of health-care professionals (Table 2).
The exploratory factor analysis yielded six eigenvalues greater than 1, indicating a six-factor structure. These six factors explained 66.7% of the variance within the model. All Cronbach's α values exceeded the threshold of 0.7. Item-level uniqueness (variance not shared with other variables) ranged from 17.1% (understand responsibilities) to 55.6% (doctors listen carefully). Regression models revealed that the hospital environment category had the weakest association with overall satisfaction (Model III coefficient: 0.23), whereas interpersonal care from doctors and nurses had the strongest association (Model III coefficients: 0.76 and 0.70, respectively; Table 4).
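The retention rule and reliability check reported above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the eigenvalues, the item count of 16 and the response data are hypothetical, and `cronbach_alpha` uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total score), which the text describes only informally.

```python
from statistics import variance


def retained_by_kaiser(eigenvalues):
    """Keep factors whose eigenvalue exceeds 1 (the Kaiser criterion)."""
    return [ev for ev in eigenvalues if ev > 1.0]


def share_of_variance(eigenvalues, n_items):
    """With standardized items, total variance equals the number of items."""
    return sum(eigenvalues) / n_items


def cronbach_alpha(item_columns):
    """Standard Cronbach's alpha; item_columns holds one list of scores per item."""
    k = len(item_columns)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*item_columns)]
    item_var = sum(variance(col) for col in item_columns)
    return (k / (k - 1)) * (1 - item_var / variance(totals))


# Hypothetical eigenvalues for a 16-item analysis (they sum to 16).
eigenvalues = [4.2, 2.1, 1.5, 1.3, 1.2, 1.1, 0.9, 0.8, 0.7, 0.6,
               0.5, 0.4, 0.3, 0.2, 0.1, 0.1]
kept = retained_by_kaiser(eigenvalues)
explained = share_of_variance(kept, n_items=len(eigenvalues))

# Hypothetical 1-4 responses from five patients to three items in one domain.
domain_items = [[1, 2, 3, 4, 4], [2, 2, 3, 4, 3], [1, 3, 3, 4, 4]]
alpha = cronbach_alpha(domain_items)
meets_threshold = alpha > 0.7
```

A domain would be flagged for review when `meets_threshold` is false, mirroring the 0.7 cut-off mentioned in the text.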
Disaggregating results by patient characteristics, we identified differential functioning of survey items based on caste. Patients who identified as part of a scheduled caste, otherwise backward class or scheduled tribe were significantly more likely to report receiving disrespectful care compared to patients with no marginalized class designation (P-value: < 0.05; Fig. 1; Table 5). In contrast, there was no statistically significant difference in reporting dissatisfaction between the groups. Only patients who identified as part of an otherwise backward class had a significant correlation between exposure to disrespectful care and reporting dissatisfaction (ρ: 0.19; P-value: 0.02). Moreover, all values fall well below the 15% dissatisfaction threshold set within the proposed value-based purchasing programme.

Notes: the proposed value-based purchasing programme in India sets an initial threshold of 85% satisfaction (15% dissatisfaction). We combined the four most negative response options (of 10) to generate a combined negative rating. We used this interpretation of dissatisfaction because the satisfaction ratings in India's proposed value-based purchasing programme will be evaluated using a 5-point Likert scale of which the two least favourable responses will be combined to a negative rating. Difference is assessed with a two-sided t-test comparing to the base group, individuals with no historically marginalized designation.

Finally, our content validity indexing results suggest that reporting discordance (that is, experiencing disrespectful care but not reporting dissatisfaction) may be due to low expectations rather than a difference in what patients value. When participants were asked about item relevance, hospital environment relevance scored lower (Fig. 2) than relevance to patients' satisfaction in 13 of 18 questions. These results align with cognitive testing results; for example, participants valued doctors listening carefully, but did not expect this to occur in practice because they did not believe it was a physician's responsibility within the Odisha inpatient setting.
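The caste-disaggregated comparison of disrespectful care and dissatisfaction can be illustrated with a short sketch. It assumes the overall rating runs from 1 (worst) to 10 (best), so the four most negative options are ratings 1 to 4; the function names and the data for the single group shown are hypothetical. Spearman's ρ is computed as the Pearson correlation of average ranks, which handles the ties that binary indicators produce.

```python
from statistics import mean


def dissatisfaction_share(ratings, n_negative=4, lowest=1):
    """Share of respondents choosing one of the n_negative lowest options."""
    cutoff = lowest + n_negative - 1
    return sum(r <= cutoff for r in ratings) / len(ratings)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / (sx * sy)


# Hypothetical data for one caste group: 1 = reported disrespectful care.
disrespect = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
overall = [3, 8, 9, 7, 9, 10, 8, 6, 9, 10]

dissatisfied = [int(r <= 4) for r in overall]
share = dissatisfaction_share(overall)
below_threshold = share < 0.15  # 15% dissatisfaction threshold from the text
rho = spearman_rho(disrespect, dissatisfied)
```

In this toy group, four in ten respondents report disrespectful care while the dissatisfaction share stays under the threshold, which is the kind of discordance the results describe.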
Interviews revealed that understandings of clinical responsibilities and corresponding expectations informed patients' overall ratings. For example, a patient participant stated: "I do feel the doctors were disrespectful, but they are the boss and this is how it is, no? So I think disrespect is important to me and my family, but if this is the same treatment I got last time, why complain? This is why my [satisfaction] score is still high."

These pilot study findings raise concerns regarding the use of an overall satisfaction rating within provider payment programmes and how we interpret traditional quantitative approaches to validation, which may assume low item functioning means low importance to the patient or satisfaction. Potential sources of variation in patient satisfaction ratings and considerations for value-based purchasing policies are presented in Table 6. These sources suggest a need to consider predicted expectations in addition to other sources of variation.

Discussion
In this pilot study, we find aspects of the care interaction beyond the physical environment, such as the quality of interpersonal care, had a strong relationship with overall satisfaction. However, these results raise concerns for the use of satisfaction ratings within a nationwide performance policy. Observed differences in care ratings may not reflect true differences in patients' satisfaction, which may vary between sociocultural groups. These findings are timely as the Indian government considers using satisfaction ratings to hold hospitals accountable to patients.
Fig. 2. Mean content validity indexing scores assessing items' relevance to patient satisfaction and hospital environment, Odisha, India, 2020

Satisfaction ratings, as a single metric, are appealing in that they theoretically capture a wide range of underlying preferences. Conversely, absent clinical expertise, patients may place undue value on more superficial aspects of the care interaction, aspects that are more subject to manipulation to improve ratings. 35 Contrary to this concern, we found the physical environment had a weak relationship with satisfaction. Patients did appear to value interpersonal aspects of care, for example, being listened to carefully and having care explained adequately. Even when examining questions that did not perform well in the factor analysis or regression models, such as receipt of post-discharge guidance, content validity indexing suggested this guidance was valued, but participants did not anticipate it to occur in practice. Traditionally, in tool validation studies, low item performance in quantitative approaches indicates that the item is not an important driver of patient satisfaction. As a result, the item may be excluded. However, our results indicate that low coefficients may result from low predicted expectations rather than low ideal expectations.

The proposed value-based purchasing programme sets an 85% satisfaction rating threshold, with facilities scoring below facing reduced health insurance scheme reimbursement. 7 In our study, despite a high proportion of respondents reporting disrespectful care, reimbursement would not be affected since dissatisfaction ratings fell well below 15%. As such, the currently designed programme may not adequately surface low-quality interpersonal care provided to marginalized patients. This type of variation in reporting, which results from differences in predicted expectations, is problematic particularly if certain patients or groups of patients have been systematically subjected to lower quality of care than others. Different thresholds for reporting satisfaction raise concern for the use of overall ratings within value-based purchasing.
36 Many public reporting and payment programmes treat satisfaction as a stand-alone measure, which is both a feasible and simple approach, particularly if variation results from differences in ideal expectations. However, this approach may fail to surface low-quality interpersonal care experienced by individuals unlikely to report overall dissatisfaction, either due to low predicted expectations or issues of expression. Scheduled tribe patients, for example, may have lower expectations of the system due to experiences of disrespect. Furthermore, patients with higher education may have unreasonable predicted expectations of the health system and/or a lower threshold for the expression of dissatisfaction. 37 Researchers developing the World Health Surveys coined the term universally legitimate expectations, which refers to a normative set of expectations. 37 Accordingly, we provide actionable considerations for improving satisfaction ratings within value-based purchasing programmes (Table 6).

This work extends the existing literature assessing patient experience and satisfaction in Indian clinical settings. 5,38,39 We build on this work by focusing on general inpatient care, instead of specific conditions or specialties, and consider policy applications given the proposed value-based purchasing programme. While some studies have used the Hospital Consumer Assessment of Health Providers and Systems tool in India as an outcome measure, 40 we were unable to find any documentation of formal adaptation or pre-testing processes that might be useful in informing the tool's use in payment policies. Our work also extends the patient vignette literature, which aims to understand differences in how individuals judge care for a fixed clinical example. 41,42 This literature exposes differences in ratings based on patient characteristics, but cannot disentangle why ratings differ. By using a formative mixed-methods approach, we were able to assess patients' values and expectations.

Table 6 (excerpt)
Addressing variation that results from differences in normative expectations may include the following:
- Pair subjective satisfaction ratings with more objective assessments of what a patient is experiencing during a given clinical interaction (that align with normative guidance) and look for discordance in patient ratings, that is, when patients give positive ratings to potentially inadequate care. b
- Due to low and variable thresholds for reporting dissatisfaction when exposed to low quality care, do not use a satisfaction rating to trigger sub-items, which are sometimes only posed to dissatisfied patients.

Expression
Expression is how patients convey or report their satisfaction with care to others, which may differ for patients regardless of ideal, predicted, or normative expectations of care and inform reporting bias, c that is, how satisfaction is expressed may differ among patients with a similar level of true satisfaction.
Addressing variation that results from differences in expression may include the following:
- Consider the addition of variables within surveys used for value-based purchasing that may inform reporting bias. For example, interview privacy and interviewer ID. Consider these factors when analysing data to address underreporting, which may be more prevalent for marginalized patients.

b For example, being yelled at by a provider is generally seen as unacceptable by both national and international standards. It is important to understand if patients consistently give positive feedback to such care, as this helps ensure that these forms of poor-quality care are challenged, particularly among marginalized patients.
c Thompson & Sunol 34 include a related concept, which they call "unformed expectations," which is when individuals are unable to articulate their expectations because they do not have expectations, have difficulty expressing their expectations or do not wish to reveal their expectations due to fear, anxiety or conforming to social norms.

Bull World Health Organ 2024;102:509-520 | doi: http://dx.doi.org/10.2471/BLT.24.290519
Liana Woskie et al. Patient satisfaction ratings, India
This study has several limitations. First, the sample size is small and we lacked a reliable sampling frame. For example, due to the small sample, we were unable to examine how patient characteristics interact with one another. However, the results and concerns raised should inform larger studies. Second, we conducted this pilot study in a rural state with a large tribal population, which may pose challenges to generalizing these findings. However, researchers have estimated that the largest increases in hospital utilization will likely occur in states like Odisha, and we lack research on survey tools that assess health system performance in the state. 43

For example, the likelihood of reporting disrespectful or abusive delivery of care in the United Republic of Tanzania increased nearly 10 percentage points in a post-discharge survey compared to an exit interview. 47 However, almost half of the women in our study had at most a primary school education, which required the enumerators to administer the tool verbally. In addition, only 82.1% (416/507) of patients could provide a phone number and for 70.0% (291/416) of them, the phone belonged to a family member or neighbour. These findings reaffirmed the reliance on exit interviews as the most practical method. The limitation of using an exit interview tool motivated us to adjust for interview characteristics in one of our regression models. Finally, the sample sizes for the cognitive testing and content validity indexing are small and not necessarily representative of the final populations that would be surveyed. Even so, the sample sizes in our study exceeded those published in the pre-testing of the Hospital Consumer Assessment of Health Providers and Systems tool in 2005 (cognitive testing: 41 versus 50 participants; and content validity indexing: 12 versus 15 participants).
In conclusion, increased access to health care does not always guarantee better health outcomes, 48 potentially due to low-quality services. 49 Therefore, improving the quality of care is crucial, but measuring it can be challenging. Patient-reported measures offer a promising opportunity for assessment. However, without a nuanced approach to identify sources of systematic reporting error, using satisfaction ratings within value-based purchasing programmes may obscure poor-quality interpersonal care for marginalized patient populations.

Fig. 1. Share of patients reporting receipt of disrespectful treatment and share reporting overall dissatisfaction with care, by caste, Odisha, India, 2020

Table 1. Methods used to pre-test and pilot the Hospital Consumer Assessment of Health Providers and Systems survey, Odisha, India, 2020
b Participants partook only in one step, that is, each group was distinct.

Table 3. Characteristics of public hospital-based exit interviewees, Odisha, India, 2020
SD: standard deviation.
a Values are no. (%) if not otherwise given.
b No historically marginalized caste designation.
c Languages spoken by less than 1% of respondents not included, hence the sum does not equal 100%.
Note: we limited the sampling to public hospitals which are slated to be incorporated within the proposed value-based purchasing programme.

Understanding of care (λ: 1.1) e
a A typical exclusion threshold for α coefficient is 0.70. The higher the α coefficient, the more the items have shared covariance and may measure the same underlying concept. Highly correlated items will also produce a high coefficient and can therefore be interpreted as a sign of redundancy. As we did not conduct the analysis to shorten the Hospital Consumer Assessment of Health Providers and Systems survey, we retain all items regardless of performance.
b Model I represents the unadjusted results of a bivariate ordinary least squares regression where overall satisfaction is the dependent variable and each row represents a different patient experience item posed to patients.
c Adjusted for patient age, gender and clinical complexity.
d Adjusted for Model II factors plus interview characteristics.
e Eigenvalues (λ) shown for retained factors. Corresponding item categories are discrete and align with factor loadings most relevant to defining each factor's dimensionality.
Note: we excluded two items (bathroom help and explanation of medicine side-effects) from this table because fewer than 50 respondents needed support with the bathroom or were prescribed medicines.

Table 5. Share of patients reporting receipt of disrespectful treatment and share reporting overall dissatisfaction with care, by caste, Odisha, India, 2020
Caste group | Reporting disrespectful treatment | Reporting dissatisfaction | Spearman's ρ a (P)
a Spearman's ρ assessing the relationship between reporting disrespectful treatment and reporting dissatisfaction.
b The general group refers to individuals with no historically marginalized class designation.